Scalable Distributed Reasoning Using MapReduce
Abstract
We address the problem of scalable distributed reasoning, proposing a technique for materialising the closure of an RDF graph based on MapReduce. We have implemented our approach on top of Hadoop and deployed it on a compute cluster of up to 64 commodity machines. We show that a naive implementation on top of MapReduce is straightforward but performs badly, and we present several non-trivial optimisations. Our algorithm is scalable and allows us to compute the RDFS closure of 865M triples from the Web (producing 30B triples) in less than two hours, faster than any other published approach.
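To make the idea of materialising a closure with MapReduce concrete, here is a minimal in-memory sketch in Python. It is not the authors' Hadoop implementation and includes none of their optimisations. It assumes a single RDFS rule (rdfs9: if `s rdf:type C` and `C rdfs:subClassOf D`, then `s rdf:type D`) and simulates the map, shuffle, and reduce phases with ordinary dictionaries. All function names (`map_triple`, `reduce_class`, `run_job`, `closure_rdfs9`) and the `ex:` example data are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def map_triple(triple):
    """Map phase: emit (join key, tagged value) pairs for rule rdfs9."""
    s, p, o = triple
    if p == RDF_TYPE:
        yield o, ("instance", s)      # key on the class of the instance
    elif p == SUBCLASS:
        yield s, ("superclass", o)    # key on the subclass

def reduce_class(key, values):
    """Reduce phase: join the instances of a class with its direct superclasses."""
    instances = [v for tag, v in values if tag == "instance"]
    supers = [v for tag, v in values if tag == "superclass"]
    for inst in instances:
        for sup in supers:
            yield (inst, RDF_TYPE, sup)

def run_job(triples):
    """Simulate one MapReduce job: map, group by key (shuffle), reduce."""
    groups = defaultdict(list)
    for t in triples:
        for k, v in map_triple(t):
            groups[k].append(v)
    derived = set()
    for k, vals in groups.items():
        derived.update(reduce_class(k, vals))
    return derived

def closure_rdfs9(triples):
    """Naive fixpoint: rerun the job over all triples until nothing new appears."""
    closed = set(triples)
    while True:
        new = run_job(closed) - closed
        if not new:
            return closed
        closed |= new

# Hypothetical example data
triples = {
    ("ex:rex", RDF_TYPE, "ex:Dog"),
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
}
closed = closure_rdfs9(triples)
```

In this sketch the fixpoint loop re-reads the whole input and re-derives known triples on every pass, which is one plausible illustration of why a naive approach can perform badly at scale.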
Similar resources
A Scalable RDF Data Processing Framework based on Pig and Hadoop
In order to effectively handle the growing amount of available RDF data, scalable and flexible RDF data processing frameworks are needed. While emerging technologies for Big Data, such as Hadoop-based systems that take advantage of scalable and fault-tolerant distributed processing, based on Google’s distributed file system and MapReduce parallel model, have become available, there are still m...
Distributed RDFS Reasoning with MapReduce
We live in the big data age, in which many computational tasks either generate or need to use large datasets. This makes parallel and distributed computing a key to scalability. MapReduce is a programming model for processing large datasets in a parallel and distributed fashion on clusters of computers. Today, since the size and complexity of RDFS documents increase rapidly, RDFS reasoning problem has...
Scalable Nonmonotonic Reasoning over RDF data using MapReduce
In this paper, we are presenting a scalable method for nonmonotonic rule-based reasoning over Semantic Web Data, using MapReduce. Our work is motivated by the recent unparalleled explosion of available data coming from the Web, sensor readings, databases, ontologies and more. Such datasets could benefit from the introduction of rule sets encoding commonly accepted rules or facts, application- or ...
WebPIE: A Web-scale parallel inference engine using MapReduce
The large amount of Semantic Web data and its fast growth pose a significant computational challenge in performing efficient and scalable reasoning. On a large scale, the resources of single machines are no longer sufficient and we are required to distribute the process to improve performance. In this article, we propose a distributed technique to perform materialization under the RDFS and OWL ...
Large Scale Fuzzy pD∗ Reasoning Using MapReduce
The MapReduce framework has proved to be very efficient for data-intensive tasks. Earlier work has tried to use MapReduce for large scale reasoning for pD∗ semantics and has shown promising results. In this paper, we move a step forward to consider scalable reasoning on top of semantic data under fuzzy pD∗ semantics (i.e., an extension of OWL pD∗ semantics with fuzzy vagueness). To the best of ...